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Abstract. We present novel analytical results about ecosystem species diversity that 
stem from a proposed coarse grained neutral model based on birth-death processes. 
The relevance of the problem lies in the urgency for understanding and synthesizing 
both theoretical results of ecological neutral theory and empirical evidence on species 
diversity preservation. Neutral model of biodiversity deals with ecosystems in the same 
trophic level where per-capita vital rates are assumed to be species- independent. Close- 
form analytical solutions for neutral theory are obtained within a coarse-grained model, 
where the only input is the species persistence time distribution. Our results pertain: 
the probability distribution function of the number of species in the ecosystem both in 
transient and stationary states; the n-points connected time correlation function; and 
the survival probability, defined as the distribution of time-spans to local extinction for 
a species randomly sampled from the community. Analytical predictions are also tested 
on empirical data from a estuarine fish ecosystem. We find that emerging properties of 
the ecosystem are very robust and do not depend on specific details of the model, with 
implications on biodiversity and conservation biology. 



1. Introduction 

Statistical physics is decisively contributing to our understanding of ecological processes. 
In fact it is providing powerful theoretical tools and innovative steps towards the 
comprehension and the synthesis of broad empirical evidences on macro-ecological laws 
and emerging biodiversity patterns [U 121 El IH E] • One field where statistical physics has 
been particularly successful is the neutral theory of biodiversity [El El El [101 [HI [EJ [13] . 
This theory is based on the assumption that, within the same trophic level, per-capita 
vital rates are species- independent. It offers a unified theoretical framework for ecosystems 
dynamics by invoking solely basic ecological processes such as birth, death, migration 
and dispersal limitation [51 E|. Exact solutions have been found for several ecologically 
relevant quantities, such as: the relative species abundance distribution (RSA) [9l [T0l[H] ; 
species spatial patterns and clustering [T21 H3]; patterns of /3-diversity (i.e. intra- and 
inter-species spatial correlation) [151 HE]; species area-relationship (SAR) (HIE]; and 
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species' persistence time distributions [191 EDI EI]. These results enabled the theory 
to be tested contrasting empirical data, and to assess the power of neutral models in 
predicting ecological patterns in many ecosystems. 

Let consider a given trophic level (i.e. plants) in a local ecosystem. "Local" means 
a community immersed in a larger one, called met a- community, that is considered as 
infinite in size with respect to the local community (serving as a reservoir). We assume 
that each individual in the local community has a natural death rate d and birth rate b 
(linear birth and death rates). Moreover, due to speciation or diversification events (i.e. 
immigration from an outside region) new species (i.e. species not already present in the 
local community) enter in the system with rate A. 

A general formulation of the neutral theory employs the birth-death master equation 
(ME) for the abundance dynamics of a species in the ecosystem [HI [III 123] • Several 
exact results are known for birth-death models where the population, N, of the local 
community is strictly conserved and/or the corresponding A scales as N (e.g see the 
infinite alleles model with mutation [T7J or the voter model [251 EEl l27j). 

In particular, a very well known and studied pattern in ecology is the RSA of the 
local community, defined as the fraction of species of a given abundance. It describes key 
elements of biodiversity, as the frequency of rare species in the ecosystem |6l [JJ . Another 
important quantity in conservation ecology is species richnesses, the number of different 
species in the community, independently of their abundance. Analytical solutions for 
both of these quantities can be found in the literature [H [TTJ, [T7J [28] . 

Unfortunately, transient solutions for species richness and its n points time 
correlations functions (i.e. the probability of having Si species at time t±, S 2 species 
at time t 2 , etc..) are not easy to calculate using this approach. Moreover, higher 
complications comes into play if we want to generalize the dynamics with non linear birth 
and death rates, i.e. birth and death rates which are not proportional to the population 
size |24J. For this reason, in the next section we introduce a coarse grained version of 
this model which will allow us to deduce in a simpler way novel analytical results about 
ecosystem biodiversity, where the only input is the species persistence time distribution. 
We note that, given the assumption of no interaction among species, no approximation 
is made to obtain these results, but they are limited to knowing whether a species is 
present /absent, as opposed to knowing how many individuals there are of that species 
at a given time. In particular we obtain close-form analytical results for: the probability 
distribution function (pdf) of the number of species in the ecosystem in transient states; 
the n— point connected time correlation function; and the survival probability, p s (r), 
defined as the pdf of time-spans r to local extinction for a species randomly sampled 
among the observed assemblages at a certain given time. A comparison of analytical 
predictions with empirical data from a estuarine fishes ecosystem will be carried out. A 
set of conclusions closes then the paper. 
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2. Theoretical Framework 

2.1. Grand- Canonical Formulation of Neutral Theory 

In the natural/realistic case where the total number of species (S) and individuals (N) 
in the local community is not fixed, we can described the abundance dynamics of the 
species in the system employing a "grand-canonical" formulation of the neutral theory. In 
particular, if 0j gives the number of species with j individuals, and = {(f>\, </> 2 , 0oo}? 
then the probability P(0, t) - of having at time t, <f>\ species with one individuals, 02 
species with two individuals, and so on - is univocally described once the initial condition 
P(4>, t = 0) is known and the transition rates from 0' to due to birth/death or speciation 
events are given. In particular, for k > 1 with birth rate bk = b ■ k ■ 0^ (birth event 
within species of k individuals) we have the transitions {0& — > 4>h ~ 1; 0fc+i - > 0fc+i + 1}, 
while with death rate dk = d ■ k ■ <ftk (death of an individual belonging to species with 
k individuals) we have the transitions {<pk — > 4>k — 1; 0fc-i 0fc-i + 1}- Finally with 
rate A the transition {0i — > <f>\ + 1} occurs. We remark that in our framework, the total 
number of individuals N is not fixed and the rate at which new species enters in the 
system is independent of N. 

In our grand-canonical formulation, the stationary solution of the master equation 
corresponding to V((j),t) can be written as P staz (0) = -| Ylk^i^k), with 

where 7 = X/b and x = b/d. Note that x represents the ratio of effective per capita 
birth and death rate and if A 7^ it has to be less than 1 in order to avoid demographic 
explosion. On the other hand if A = 0, then at equilibrium there are no individuals in 
the community because all species eventually go extinct [HI E] ■ 

In our theoretical framework the RSA is deduced by the first moments of V s taz, 
while species richness is described by the probability of having s different species in 
the community at stationarity, that turn out to be Poisson distributed with mean 
(S) = -7ln(l-:r): 

V(s)= J2 nW K (^-s)= { - lH ^- X)) \ l-xy (2) 

01,02, — ,4>oo j 

where 5k is the Kronecker delta, which is 1 when its argument zero and zero otherwise. 

Since to maintain a finite local community size, x must be less than one, then each 
species is eventually doomed to extinction. We thus can define the persistence time r of 
a species as the time incurred between its emergence in the system (due to diversification 
or immigration) and its extinction [TT?| [201 EI] ■ In particular, persistence times for a 
species that undergoes the linear birth-death dynamics presented above, are distributed 
according to the species persistence time (SPT) pdf [19J, 

2 

e d(l-x)t_ (3) 



Ps P t{t) = d 



1 — X 



e d(l-x)t _ I 
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From now on, without loss of generality, we set d — 1. From Eq. (|3]) we can express the 
mean persistence time, a key ecological quantity also known as mean extinction time 
[T9l IT71 128] . as (r) = J °° rp spt (r)dr = — ln(l — x)/x. Finally, as expected, the mean 
number of species at stationarity is related to (r), through Eqs. §2§ and (|3]), by the 
simple formula (S) = A(r). 



2.2. The Coarse- Grained Model 

A coarse-grained view of the grand-canonical formulation of the neutral theory can be 
described as follows. Each species % within the local community emerges at a random time 
{ti}i>o with a rate A, that is, the probability a new species emerges in the infinitesimal 
time interval dt is Adt. Therefore the probability of having k new species in the ecosystem 
up to time t, £4(t), is given by (see Appendix A): 

™ - ^ i S- (4) 

s =l 

where so is the number species at time t — 0, i.e Uk(t = 0) = 5x(k — s )- Then the 
newcomer species % persists for a random duration Tj where the random variables r, are 
independent and identically distributed with a given pdf p sp t(r). Therefore species arrive 
as a Poisson process (with rate A), and then depart after some non-Poissonian waiting 
time t ~ p spt . 

This model is coarse grained in the sense that we do not take explicitly into account 
information on species abundance, implicitly contained in the functional form of p sp t{ T ) 
(see Figure [T]) . This approach enable us to go beyond Eq. ^ , and to generalize these 
results also for general birth and death rates, by taking different functional shape of the 
SPT pdf. We remark that the mapping is exact only with respect to the grand-canonical 
formulation of the neutral theory, while, as we will show, is only a good approximation 
of the classic birth-death ME approach in the limit of large local community (N fixed 
and N — > oo). 

Within our framework, the number of persistent species in the ecosystem at a given 
time t (see Figure [2]) is given by: 

k(t) 

S , (t)=^9(t i + r i -t), (5) 
i=i 

where k(t) is the number of new species that entered into the system in the time interval 
[0, t), i.e. U < t for i < k(t) and U > t for i > k(t). Q(z) is the Heaviside step function, 
which is 1 when its argument is positive and zero otherwise, tj is the time when the i-th 
species enters into the ecosystem and ti + r« when it extincts. The probability of having 
s species present in the ecosystem at time t is thus V(s,t) = (Sk(s — S(t))), i.e. 

+oo „i k j, k i k 

V{s, t) = J2 Mt) / II ~T / H dT jPspt(Tj)S K \ s-J2 ®(U + n-t) 

k=0 Jo i=l 1 Jo 3=1 \ i=l 
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It is useful and customary to define the generating function of the process (the discrete 
Laplace transform), V(z,t) = zS ^P{ s ^ an d analogously for £4(t). Eqs. Q and 
(J6| lead to (see Appendix A) 

V(z, t) = U(l + ^±f(t), Oy^-V, (7) 

where 

pt p+oo 

f(t)= / drP > (r)= / Pspt (T)mm[t,r]dr, (8) 
Jo Jo 

with P>(£) = p S pt( T )- If) for example, we assume that C/ s (0) = Sk(s — 1) (and 
therefore U(z, 0) = z), then from Eq. ^ we get 



SI 

One can see that f(t)/t = J °° p spt (r)mhi(l, r/t)dr — > in the £ — >• oo limit. This 
follows from min(l,r/£) < 1, lim^oo min(f , r/t) = for all r and the Lebesgue's 
dominated convergence theorem (see for example [2H])- Since U(z,0) is continuous 
in the interval -1 < z < 1 we have U((l + (z - l)/£)/(£),0) -> Z7(l, 0) = 1 in 
the large time limit, implying that the initial condition is forgotten in this limit, as 
expected. If / °° p sp t(r)TdT = (r) < oo, then lim^oo /(£) = (r) and from Eq. Q we get 
V(z) = lim^oo V(z, t) = e^ l ~ z ^ T \ or equivalently 

V{s) = lim V(s, t) = i^ll!e- A<r) , (10) 

t— >oo si 

i.e. a Poisson pdf with mean given by the product between the species emergence rate A 
and the mean average persistence time. As expected, we found the same result given by 
Eq. @. 

FromEq. Q we can easily calculate the average numbers of species at a generic time 
t. Indeed, if (S) t =o = J2 S s U s (t = 0) < oo, we have that U(z, 0) admits left derivative at 
z = 1 and taking the left derivative at z = 1 of Eq. ([7]) we obtain 

(S)t = §-V(z, t) lz=l = (S) t J-f- + Xf(t) *3° A(r), (11) 

where again we see that the initial condition is forgotten in the large time limit. These 
results have been tested numerically (see Figure pj). 

3. Generating Function Approach 

We are interested in the n— point connected time correlation function of our process (or 
cumulant, see j3U] ) (5'(^ 1 )5'(^ 2 )"""5'(^ n )}c i n the stationary conditions (mirij =li ... )n £j — > oo) 
and t l+1 — t % fixed, for % — 1, n — 1. The generating function is given by 

+oo k 

■ "I, I TTT , 

X 



pi 2 J-t p+oo 

Mih}) = (e~x d ^)=Y,Uk(T) / n? / n^w^: 

k=0 J ° i=l 1 J ° j=l 

k „ T 

x exp{- / dt'h(t')Q(ti +n- t')Q(t' - U)} = 
i=i Jo 
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J2Uk(T)z(T) k = U(z(T),0)e 



\T(z(T)-l) 



(12) 



A:=0 



where 



z(T) = £ |(exp { - J T dt'h(t')Q(t + r - t')Q(t' - t)})^ 

with (-) T = J °° dr ■ p spt (r). 

The n— point correlation function is obtained by choosing 

n 

h{t) = ^25{t- f)hi with hi > 



(13) 



i=l 



and 



d" 



(14) 



(15) 



lim (S(t 1 )S(t 2 )---S(t n ))c = lim ^ ^- In [^(W)] U _ , 

where we assume < t 1 < t 2 < ■ ■ ■ < t n < T. 

We will show that in the stationary conditions i) lim^oo z(T) = 1 and ii) if 
(t) t < 00, 

d n 



(Sit^Sit 2 ) ■ ■ ■ S(t n )) c = lim AT 



t->oo dhi ■ ■ ■ dh n 



(16) 



Using Eqs. (13) and (14) we obtain 



z(T) 



/ ^(exp{-^^e(t + r-f)e(f-t)}) r 

J ° 1 i=l 

n 

] [ [(e-^ - l)Q{t + r - f )6(f - t) + 1 



dt 1 
T\ 



(17) 



/ r £ n — maxjO, t %k — rj , \ J\ , h . , . 

e ( max {°' — ^ — ->) T ri( e j - !). 

fe=l ii<i2<---<ik j=l 



where we have expanded the product on the r.h.s. of Eq. (17) as n*Li(l + v i) = 
1 + Er=i^ + Er i <i 2 a; n^2 + --- = 1 + Yl n k=iJ2i 1< i 2< -i k Vi 1 Vi 2 ■ ■ ■ v lk . Using the Lebesgue's 
dominated convergence theorem, the average in Eq. (18) tends to zero in the T — > 00 
limit and thus limj^oo z(T) = 1, which proves i). Let us consider now 

n k 

lim T(z{T)-l) = km ^ E (max{0, f ^maxjO, f fc -r}}) J]/" 



T-!.oo z — ' z — ' \ 
fc=l «i<«2<---<ifc 



e -'s -i . 



Since max{0,t l1 — max{0,t* fe — r}} < r and lmifi^oo max{0, i n — max{0,t ifc — r}} 



max{0, r — (t* ft — t* 1 )}, again for the Lebesgue dominated convergence theorem, Eq. (19) 
leads to 

n k 



lim T(z(T)-l) = J2 E [I^- 1 ) W ' 7 "^-^)^ 5 ( 2 °) 

fc=l ii<i2<---<ik J=l 



and therefore we have that Eq. (12) becomes 



lim Z T ({/>}) = e 

T— >-oo 



A3(W) 



(21) 
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Using Eqs. (15) and (21) we finally prove Eq. (16). In particular, we note that in the 
large time limit, the n— point connected time correlation function 

hm (S(t l )S(?) ■ ■ ■ S(t n )) c = A([r - (t n - ^)]e(r - (t n - t l ))) r (22) 

t 1 — s>oo 

is independent of t 2 ■ ■ ■ t n ~ l G (t 1 , t n ). 



4. Survival times pdf 

The survival time r s is defined as the time to local extinction of a species randomly 
sampled among the observed assemblage of species at a certain time T (see Figure [2]). 

We can express t s as a function of the random variables to (emergence time of that 
species) and r for which the pdf is known: 

Ts = t + r-T if < t < T and t + r > T. (23) 

We then express the survival time (ST) pdf conditional to a persistence r as: 

p s (t\r) = C(5(t - (t Q + r - T))6(t + r - T)6(T - t )e(£ )>, 

where the constant C ensures normalization. Solving the ensemble average operators 
yields: 

p s (t\r) = CQ(t - t)Q(t - r + T)6(t), 

and, by marginalizing over r, we obtain the ST pdf: 

1 f t+T 

^ )= (min(r,T)U (24) 

Without loss of generality, it can be assumed that T is not affected by the boundary 
condition in t — 0, i.e. T — > oo and = -A- f t p S pt( r )dr. Particularizing now to the 
case of persistence distributions for linear birth-death processes (Eq. (|3])), the survival 
pdf asymptotic behavior is: 

* w « f ,-((i-„ T /(i-e-<-^,)^-<-.* cc ~ { ; << (2S 

where £* = (! — a;)^ 1 . 



5. Applications and Comparison with Empirical Data 

5.1. Gamma Species Persistence Time Distribution. 

Recent results |20| |2T] have shown that SPT distribution for several different type of 
ecosystems exhibit power-law behavior with an exponential— like cut-off: 

p spt (t) = A' 1 t- a e-^%t - r ), (26) 

where A = (1 — x) a ~ 1 r(l — a, (1 — x)t ) is the normalization constant, and T(a, b) is 
the incomplete Gamma function. The exponent a of the power law is suggested to 
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depend on the spatial structure of the embedding ecosystem. SPT distributions exhibit 
progressively smaller scaling exponents a for increasing constraints in the connectivity 
structure of the environmental matrix [20j. Specifically, numerical simulations [20] show 
that the exponents range between a = 2, that corresponds to the case of global dispersal 
(mean field), a = 1.91 ± 0.01 for a 3D lattice, a = 1.83 ± 0.02 in a savannah (2D lattice), 
a = 1.64 ± 0.02 for river network topology, up to a = 3/2 for ID systems. 



For the functional shape of the SPT distribution given by Eq. (26) with a < 2, the 
following asymptotic limits are obtained: 

/oo 
p spt (t')dt' = (r) T p s (t) = (27) 

r (1 - a ,( 1 i-, )T / ' (Mt (( 1 " ^)~ a (! - + o (^y) 2 , for t » t* 

(To/t) a -\ for T < t < t* 

while -P>(t) ~ 1 for t — )■ 0. The average persistence time follows from Eq.(|8]) 

(T)= r^>w=/oo= T ^-;' (1 -: ) :° ) . (28) 







£ Q ((1 - x)t ) 

where E n (z) = e~ zt /t n dn is the exponential integral function. Setting tq = 1 for 
simplicity in Eq. (pBl, the corresponding two point correlation function derived from 



(22) is: 



A F(2 - a,t(l - x)) - t(l - x)T(l - a,t(l - x)) , . , 

Therefore, the covariance is a decreasing function of t, that for large times goes to zero, 
i.e., lim^ +00 cov s {t) ~ e-^H- a = 0. 

5.2. Scale-Free SPT distributions 

An interesting case is obtained when x — > 1, i.e. the system is in the scaling regime 
where it does not exhibit a characteristic time scale. This is typically reported when 
SPT of families or genera - as opposed to species - are considered, possibly measuring 
them from the fossil record |31j . Thus, longer time scales and long time— series must be 
considered. 

Under such assumption, the normalized SPT pdf reads as: 

p sp tit) = («-!) T«-H- a 6(t - r ); a > 1, (30) 

whereas the cumulative SPT distribution is P>(t) = (y) Q_1 6'(t — r ) + 6>(r — t), and 
from Eq.Q we obtain: 

f/ t \ = J min bo,t] + r 9(t - t ) {t/Tl f_™~ 1 , a ^2; ^ 
min[r ,t] +r 6(t - r ) In (t/r ), a = 2. 



For this case, and in the limit T>t, the ST pdf given by Eq. (24) becomes 
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We note that two different cases can be pointed out. 1) V s tat(S) exists and depends 
on a microscopic time-scale r . This imply a > 2, leading to \im t ^ +oc f(t) = Tofz^ = /oo- 
2) For a < 2 the stationary pdf V s tat(S) does not exists. In fact in this case 
/(*) ^ ^) 2 ~ a oo for a < 2 and /(f) = r ln(5^) *3° oo (e is the 
Neper number). The two point correlation function is thus obtained as: 

cov s (t) = A (a - 1) r*- 1 / r- a {r-t)dr={ A <*-2 J ' a > Z ' (33) 



oo, a < 2. 



leading to lim t _> +00 cov5(t) 



0, a > 2; 



+ oo, a < 2. 



5.5. Hinkley Fish estuarine Database 

To test the results of the coarse grained neutral null model, we employ a long-term 
monthly database of a estuarine fishes ecosystem. Fish samples were collected from the 
cooling-water filter screens at Hinkley Point Power Station, located on the southern 
bank of the Bristol Channel in Somerset (England). The data span the period from 
January 1981 to January 2010. A full description of the intake configuration and sampling 
methodology is given in [32] • A matrix P is built using presence- absence records for each 
months during the 29 years. Each element P s t of the matrix is equal to 1 if species s is 
observed during month t, otherwise P st = 0. The empirical persistence time are defined 
as the number of consecutive months in which the measurements reveal the presence of 
the species. The presence- absence time series thus form a vector of length T, where T is 
the total number of moths of observation. 

Persistence time is defined as the length of a contiguous sequence of Is in the 
time series. From such time series we can thus reliably estimate the empirical average 
persistence time (r) and the species emergence rate A. For the analyzed fish estuarine 
ecosystem we find (r) = 3.02 [month] and A = 4.83 [month -1 ]. Moreover, by summing 
over rows of the matrix P st , we obtain the total number of observed species on month t, 
i.e., St = ^2 s Pst ■ We can thus calculate (assuming stationarity) the distribution of the 
number of persistent species in the system (Figure^), its first moments, S = Ylt=i $t/T 
(Figure [3]d), and the empirical two-point correlation function (Figure [1]): 

p(At) = ^=Am T (^-^> {St-At-S) 
where At is the time lag and cr| = ^ Y^=i {St ~ S) 2 , so that p(0) = 1. 



The average number of species predicted by the model can be obtained by Eq. (11), 
(S) = A(r) = 14.6, whereas the standard deviation is \/Mt) = ±3.8 that is in good 
agreement with the observed ecosystem diversity S ±o~s — 14.7 ±3.7 calculated from the 
presence-absence matrix P st (see Figure [3h) . The null hypothesis of a Poisson species 



distribution given by Eq. (10) is accepted by both the Kolmogorov-Smirnov test and 
the x 2 test within a 95% confidence interval (P va iue = 0.05). This analysis suggests that 
species diversity data cannot be used by themselves to discriminate among different 
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mechanisms of demographic growths, i.e type of birth-death processes. Similar conclusion 
have been achieved using RSA data [21]. 

The autocorrelation reveals a relevant periodic behavior in the species times series 
(Figure inset Ilk). To investigate such periodicity we study the power spectrum of the 



time— series S t = S t — S t , i.e. "B[oJj\ = ^f\x(^j)\ 2 , where — St=i S t e Wit is the 

Fourier transform of St and Uj = 2jir/T (j = 1,2, „T) is the frequency. The spectrum 
exhibits a peak for j « 30, indicating that the period is t p fa 12 months (see Figure 
[4^,) . Therefore the periodicity is a trivial effect due to seasonality of weather patterns 
and is not related to the fluctuations or intrinsic noise of the system dynamics |33j. 
To test Eq. (29) against empirical data, we then smooth out the periodicity due to 
seasonality. In order to do that, we calculate the empirical autocorrelation function only 
considering a specific month for every year and then averaging over all twelve months, 
i.e., {At) = E£Wi M J ( S l - S 3 ) (Sl At - Si) /((Mj - At)(a])), and p = Pi/12, 
where j = {Jan, Feb, Dec} and Mj is the total number of j— type month in the whole 
time series. As can be seen from the inset in Figure |2Jd, once we remove seasonality Eq. 
( 29 ) describes well the autocorrelation function p. 

An analysis of the fish SPT distribution have been also carried out. We find that 
SPT distributions are well described by the Gamma SPT with parameters compatible 
with a fa 2. and x = 0.9999. This result holds with particular accuracy only when large 
SPT are considered. In fact, at monthly time scales, seasonality affects short SPTs, 
increasing the slope of the first part of the SPT pdf. Due to the negligible effect of 
the cut-off at the monthly time scale, we find that also a power-law SPT with a ~ 1.9 
fits the empirical SPT pdf. Note that here we are only interested in the estimation of 
the input parameters of the coarse grained neutral model, (r) and A, as they can be 
calculated directly from the SPT measured time— series without the need of fitting the 
entire SPT distribution. 



6. Robustness of the results 



6.1. Agreement with mean field voter model results. 

As we claimed in section 2, the proposed model is a coarse grained view of the grand- 
canonical version of neutral theory. Nevertheless we now show that our coarse grained 
model give a good approximation of results stemmed from the mean field approximation 
of the voter model with speciation [TBI EE] or, equivalently the infinite alleles model 
with mutation [T7] (for details on the mapping between these two models we refer to 
[27]). The scheme of these models is the following. Consider a local community of 
N individuals. At every time step a randomly selected individual in the ecosystem 
dies. With probability 1 — v the individual is replaced by a colonizing offspring of 
another individual, randomly selected within the ecosystem whereas with probability v 
the offspring belongs to a new species. For N fixed and N — > oo the above dynamics is 
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described a linear birth-death ME with b — l — v and d — 1 (and thus x = b/d = 1 — v): 

= a:(n-l)pW 1 (*)+(n+l)pJl 1 (t)-(a;n+n)pW(t) forn > 1 (35) 



where i = 1, 2, £ < N and p„ (i) is the probability for the i th species of having an 
abundance n at time t. For the hypothesis of neutrality we have that Pn\t) can be 
expressed as: 

p^(t)=p* n (t-t l ), 

where p* n (t) is the probability for a single species to have abundance n at time t 
after its emergence (under the neutral assumption p* n (t) is species invariant). The 
average population of the i -th species, given that it is still present at time t, is 
(n>i) = E n >i n Pn\t) / E m >i Pm (t). The mean population size at time t for the mean 
field voter model can be calculated as: 

,„,,« = / E,. ( :'srop;(i-t.)'n (Ef. ( ;'£rop:(t-t» 

{ ( " ^ E2? ES a(« - «.) ; ~ < Efi? ES *(« - «.)> ' 

where -D(t) is the number of diversification events occurred until time t and the ensemble 
average is over the random variables £j and D(t). By using the fact that for all i, ti is 
Poisson distributed with the same frequency A and Unit) is the pdf of the D— variable (see 
Eq. (IB), then equation ffl simplifies to (n(t)) = /* tft' 'E„«Pn(*') / £ rft 'E n >iK(0- 



Thus, from the definition of p* n (t), it follows that: En n Pn.(*0 = or ' the 



mean population of a species after a time t from its emergence. Using Eq. (35) 
it obeys the deterministic equation d(n*) t /dt = —(1 — x)(n*) t , which solution is 
(n*) t = (ra*)oexp(— (1 - x)t) [19] . Thus we have that / *(n*)o exp(— (1 - x)t')dt' = 
(n*)o (1 — exp(— (1 — x)t))/(l — x). We observe that J2 n>1 Pn(t') is the probability that 
the species has more than one individual at time £, that is the cumulative distribution of 
the SPT pdf P > (t) = Yl< n> iP* n {t) = f^°° Pspt( T )d r and therefore 

/ dfyV(f)= / rff / p vt (r)dr=(min(r,t)) T = /(t). (37) 
Jo n > x Jo Jt> 

Using the above relations and (n*) = 1 (since by definition p* n {t = 0) = Sx(n — 1)), we 
obtain 

from which it follows that (S) = iV/ (n) = iV(l — x)(t). Therefore if we approximate N 
with the average number of individuals in the corresponding grand-canonical ensamble 
Eq. @, i.e. (N) = xd x \nZ = A/(l — x), we found that Eq. d38J) is the same result we 



have obtained from the coarse grained neutral model (compare with Eq. (11). 



Interestingly, it has also been found that, for linear birth and death processes, the 



survival probability function has the same asymptotic functional shape given by Eq. (25) 



[19] . In fact, by assuming an initial population distribution given by the Fisher log-series 
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T^RSAin) = — x n / (n\og(l — x)) [8J, and denning p sp t(t \ n ) as the probability that a 
species starting with no individuals is still present at time t, it has been shown that (19] : 



Ps(t) = J2 *V(* I n o)^RSA(n ) = | J ^ ^ (39) 

no=l I ' 

that is, it has the same asymptotic behavior of the ST pdf as obtained in Eq. (25) from 
the coarse grained model. 

6.2. Universal relation between persistence and survival distributions 

The fact that the survival probability function has the same asymptotic behavior in two 
different neutral models suggests that, rather than by chance, it possibly happens as a 
consequence of a deeper, and more general relationship between the SPT distribution 
and the ST probability function, valid regardless of the specific birth and death processes 
assumptions or, in other words, independently of the functional shape of b(n) and d(n). 

To address this issue, we start by calculating the RSA at a stationary time T 
(with absorbing boundary condition in n = 0), assuming that each species has only one 
individual when it emerges. Regardless of the type of birth/death rate, such relation can 
be written as: 

1 f T 

V a ( n )= lim = / duP(n,T-u 1,0) = 



M 7m / duP(ji,u\l,0). (40) 

T^+oo f(T) 







where P(k,t\n,to) is the probability to find k individuals at time t given that there are 



n at time to < t. The normalization has been calculated using Eq. (37). 
The ST probability function is thus: 

oo oo j -t 

Ps(t) = Y J Ps P t(t\n)V a RSA (n) = T hm J]-P(0,*|n,0) A / duP{n, u\l, 0). 

n=l ~* °° n=l ^° 

Being at stationarity, time-translational invariance P(n2, ^l^i, ti) = P(^2,^2 — 
ti\ni, 0) holds: 

1 f T d °° 

p s (t) = lim — — / — } P(0,t\n,0)P(n,0\l,-u)du 



T— > 



T— » 



I f d 

lim 777V / ^y"^(0,tK0)P(n,0|l,-M)d- 

lim 7777/ l^(0,t|l,-^)-P(0,t|0,0)P(0,0|l,-n)]rfn 

77k/ -^(<M + u|l,0)du= lim -J- / dP(0 



t + ttll, 



= t K ? 77^^>W1 ; 0)-P>(t + T|l,0)], 
t^+oo / (T) 

where P(0,t|l,0) = 1 -P>(t|l,0) and P>(t|l,0) = P(n, t|l, 0) = I t °° Ps P t(r)dr. 

We thus obtain 

1 f°° 

Ps(t) = 7^ / p S pt(r)dr, (43) 
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and this relation is valid in general, independently of the specific master equations obeyed 
by P(n,f|l,0). 

7. Conclusions 

In this paper we have proposed a neutral model for generic birth-death processes, where 
information on species abundance is subsumed by the SPT distribution, p spt , associated 
to the ecosystem under study. The model can be seen as a coarse grained version of the 
grand-canonical approach to neutral theory. This framework has two main advantages: 
1) Allowed us to obtain analytical results as the transient (and stationary) dynamics of 
ecosystem species richness, as well as the complete analytical description of the n— point 
correlation function of species diversity; 2) Provide a simple null model, incorporating 
all the main features of any neutral model based on birth-death processes, as a function 
of one apriori distribution, namely the SPT distribution. This highlights the important 
role of p spt , as synthetic descriptor of the ecosystem dynamics. One of the main results 
obtained is given by the relation eq. (41), valid for any birth and death process, between 
the persistence time distribution p spt and the survival probability function p s . All the 
presented results are exact in the assumption that species are noninteracting. We have 
compared the analytical results of our model with empirical data on species diversity for 
a estuarine ecosystem. The analysis shows that, in spite of the minimalist assumptions 
of our model, complex emergent patterns in the ecosystem dynamics can be captured 
by the proposed coarse grained neutral framework. This suggest that species diversity 
data cannot be used by themselves to discriminate among different type of birth-death 
processes. Further studies and different approaches are required to determine how 
species diversity patterns are related to different type of demographic dynamics. Two 
major simplifications of our analysis are the non-interacting ideal gas like assumption 
and ignoring the effects of the spatial distribution. Further research will probe what 
qualitative changes would arise by relaxing the mean field-like assumption presented 
here by accounting for dispersal limitation. 
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Figure 1: a) Coarse grained view of ecosystem species dynamics. Species emerge in the 
ecosystem uniformly in time, and each persists for a random time r drawn from the SPT 
pdf p sp t- 




Figure 2: Schematic representation of survival times t s , defined as the time to local 
extinction of a species randomly sampled among the observed assemblages at a certain 
time T. r, instead, denotes the persistence time of a species, and it is defined as the 



time incurred between its emergence in the system and its extinction. Eq. (43) gives the 



relation between the two distributions, independently of the functional form of the birth 
and death rates. 
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Figure 3: a) Comparison between the empirical SPT pdf (histogram) and theoretical 
V(s) given by Eq. ( JIo) ) (blue line), b) Monthly time series of the analyzed estuarine fish 
ecosystem. The black dashed line represent the mean number of species St, while the 
green dot-dashed line the predicted average (S). c) The analytical solution for V(s) has 
been verified numerically. 
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Figure 4: a) Power spectrum (H[tUj]) analysis of the time series St and of the 
autocorrelation function p(At) given by Eq. (34) (in the inset). We find a periodic 



behavior of the time-series with period t p « 12 months, b) We test the analytical 



result of the n— point correlation function (red dashed line) given by Eq. (22) for 
Pspt(t) = "^ e ~^ T ° y i a numerical simulations. The parameters used in the simulation 
are T = 1000, r = 2 and A = 10. Gray squares represents the two point 
{(S(t) — S)(S(t + At) — if?)) function, while green and blue dots are the three point 
correlation function ((S(t) - S)(S(t + a At) - S)(S(t + At) - S)), with a = 1/3,2/3, 
respectively. In the inset: comparison between empirical autocorrelation function p(At) 
(that does not take in account seasonal periodicity in S t ) (black dots) and analytical 



autocorrelation function predicted by the coarse grained model using Eq. (29) with 
(1 — x) — 0.0001, a = 2 and A = 4.83 (gray solid line). The dashed gray horizontal lines 
represent the 5 % confidence interval with respect to p = 0. 



An exactly solvable coarse-grained model for species diversity 17 
Appendix A. 

In this appendix we provide some mathematical details of the results presented in 
the main text. We start by achieving the probability of having s new species in the 
ecosystem during the time interval [0,t), U s (t), given by Eq. (|4]) in the main text. It's 
easy to write the ME for U s (t), U s (t) = \{U s -i{t) — U s (t)), i.e. new species enter in the 
system at rate A. The corresponding differential equation for the the generating function 
U(z,t) = T,7=i z s U s {t) ]aU(z,t) = \{z-l)U{z,t) which leads to U(z, t) = e xt ^U(z,0). 
Assuming the initial condition U s (0) = 5k(s — s ), then U(z, 0) = z s ° and 



oo 



U(z,t) = e- xt z s °e xtz = e~ xt ^ ^^ +k = e~ xt T^v/ ■ (A.l) 

fc=0 ' s=s \ S °)' 



From Eq. (A.l) follows 



(At)* 



exp(— At), for k > sq; 



U k (t\s ) { (*-«)' ^ ™» ; - ^ (A.2) 
(J, for k < so, 



which in turn, by noting that Uk(t) = Y^=i Uk(t\so)U So (0) , leads to Eq. (H). Let's 
derive now the probability distribution of the number of persistent species given by Eq. 

(§■ 

Using Eqs. ^ and ( |A.2[ ), the generating function of V(s,t) reads as: 

+oo oo 

V(z,t)= ^^e(n- So )t/ So (0)e- 



-xt (At)"~ so 

-t n 



X 



ft •* Jj poo "■ 

/nf/ n^^)^ 6 ^^- (a.3) 



i=l " u i=l 

Because of the independence of the random variables ti and Tj we can write 

where we have set ti = £ an d r « = t for i = l,2,...,n and I(z,t) = 
Jo Jo°° drpspt^z ^ 1 '^. Finally, using Y17=o = e * ■> an ^ through the relation 

z e(t +r-t) = I @ (r _ q z + @ (t _ r) ^@ (to _ (t _ r)) + Q(t _ (to + T ^ dtQ 
Jo 

= ztQ(r - t) + (1 - 0(r - £))(zr + t - r) (A.5) 

we have 

Z(z,*) = / ^ / c/rp spt (r)^0(r-t) + (l-0(r-t))(^r + t-r) 
Jo t Jo 

= (z-l) n drrp spt (r) + t(P>(i) - 1)) = (* - l)/(t) - t, (A.6) 
where we have used J* dTTp sp t(r) = — f drTP > (r) = —tP > (t) + J drP>(r) and 

/■t /"+0O 

/(*) = / drP>(r)= / p spt (r)min[t,r]rfr, (A.7) 
Jo Jo 
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Substituting Eqs. (|A.8|) and (|A.6|) in Eq. (|A.4|), together with the observation that 

(A.8) 



©fa - s )f/ so (0)X(z,t) so = U(l + 



s =l 



f(t),o) 



leads to Eq. ([7]) presented in the main text. 
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